SUMMARY & CONCLUSIONS

The prognostic technique for a power MOSFET presented 
in this paper is based on accelerated aging of MOSFET 
IRF520Npbf in a TO-220 package. The methodology utilizes 
thermal and power cycling to accelerate the life of the devices. 
The major failure mechanism for the stress conditions is die- 
attachment degradation, typical for discrete devices with lead- 
free solder die attachment. It has been determined that die- 
attach degradation results in an increase in ON-state resistance 
due to its dependence on junction temperature. Increasing 
resistance, thus, can be used as a precursor of failure for the 
die-attach failure mechanism under thermal stress. A feature 
based on normalized ON-resistance is computed from in-situ 
measurements of the electro-thermal response. An extended
Kalman filter is used as a model-based prognostics technique
based on the Bayesian tracking framework.

The proposed prognostics technique is preliminary work
that serves as a case study on the prediction
of remaining life of power MOSFETs and builds upon the
work presented in [1]. The algorithm considered in this study
has been used as a prognostics algorithm in different
applications and is regarded as a suitable candidate for
component-level prognostics. This work attempts to further the
validation of such an algorithm by presenting it with real
degradation data, including measurements from real sensors,
which include all the complications (noise, bias, etc.) that are
typically not captured in simulated degradation data.

The algorithm is developed and tested on the accelerated
aging test timescale. In real-world operation, the timescale of
the degradation process, and therefore of the RUL predictions,
will be considerably larger. It is hypothesized that even though
the timescale will be larger, it remains constant through the
degradation process, and the algorithm and model would still
apply under the slower degradation process. By using
accelerated aging data with actual device measurements and
real sensors (no simulated behavior), we are attempting to
assess how such an algorithm behaves under realistic conditions.


1 INTRODUCTION 

Prognostics is an engineering discipline focused on 
predicting the time at which an in-service component will fail. 
The science of prognostics is based on the analysis of failure 
modes, detection of early signs of wear and aging, and fault 
conditions. These signs are then correlated with a damage 
propagation model and suitable prediction algorithms to arrive 
at a “remaining useful life” (RUL) estimate. The discipline 
that links studies of failure mechanisms to system lifecycle 
management is often referred to as prognostics and health 
management (PHM). Power semiconductor devices such as
MOSFETs (Metal Oxide Semiconductor Field Effect Transistors) are
essential components of electronic and electrical subsystems in
on-board autonomous functions for vehicle controls,
communications, navigation, and radar systems. In current
practice, maintenance schedules are usually based on
reliability data available from the manufacturer. However,
while this approach works well in aggregate for a large number
of components, failures of individual components are not
necessarily averted. For mission-critical systems it is
extremely important to avoid such failures. This calls for
condition-based prognostic health management methods.

1.1 Related Work 

In [2] a model-based prognostics approach for discrete 
IGBTs was presented. RUL predictions were accomplished 
using a particle filter algorithm where the collector-emitter 
leakage current was used as the primary precursor of failure. A 
prognostics approach for power MOSFETs was presented in
[3], where the threshold voltage was used as a precursor of
failure; a particle filter was used in conjunction with an
empirical degradation model.

Identification of parameters that indicate precursors to
failure in discrete power MOSFETs and IGBTs has received
considerable attention in recent years. Several studies have
focused on precursor-of-failure parameters for discrete IGBTs
under thermal degradation due to power cycling overstress. In
[4], collector-emitter voltage was identified as a health
indicator; in [5], the maximum peak of the collector-emitter
ringing at the turn-OFF transient was identified as the degradation
variable; in [6], the switching turn-OFF time was recognized as a
failure precursor; and switching ringing was used in [7] to
characterize degradation. For discrete power MOSFETs,
ON-resistance was identified as a precursor of failure for the
die-solder degradation failure mechanism [8, 9]. A shift in
threshold voltage was identified as a failure precursor for the
gate structure degradation fault mode [10].

There have been some efforts in the development of
degradation models that are a function of the usage/aging time
based on accelerated life tests. For example, empirical
degradation models for model-based prognostics are presented
in [2] and [3] for discrete IGBTs and power MOSFETs,
respectively. Gate structure degradation modeling of discrete
power MOSFETs under ion impurities has been presented in
[11].

2 ACCELERATED LIFE EXPERIMENTS

The development of prognostics algorithms faces similar
constraints as reliability engineering in that both need
information about failure events of critical electronics
systems. Such data are rarely available. In addition,
prognostics requires information about the degradation
process leading to an irreversible failure; therefore, it is
necessary to record in-situ measurements of key output
variables and observable parameters in the accelerated aging
process in order to develop and learn failure progression
models.

Thermal cycling overstress leads to thermo-mechanical
stresses in electronics due to mismatch of the coefficient of
thermal expansion between different elements in the
component’s packaged structure. The accelerated aging
applied to the devices presented in this work consists of
thermal overstress. Latch-up, thermal run-away, or failure to
turn ON due to loss of gate control are considered as failure
conditions. Thermal cycles were induced by power cycling the
devices without the use of an external heat sink. The device
case temperature was measured and directly used as the control
variable for the thermal cycling application. For power
cycling, the applied gate voltage was a square wave signal
with an amplitude of ~15 V, a frequency of 1 kHz and a duty
cycle of 40%. The drain-source was biased at 4 Vdc and a
resistive load of 0.2 Ω was used on the collector side output of
the device. The aging system used for these experiments is
described in [5], and the accelerated aging methodology is
presented in [8].

In-situ measurements of the drain current (I_D) and the
drain-to-source voltage (V_DS) are recorded while the device is
under the aging regime. The ON-state resistance R_DS(ON) in this
application was computed as the ratio of V_DS and I_D during the
ON-state of the square waveform. In the accelerated aging
system, it is not possible to measure junction temperature
directly; as a result, the increase in junction temperature is
observed by monitoring the increase in R_DS(ON). Furthermore,
junction temperature is also a function of the case temperature,
which is measured and recorded in-situ. Therefore, the
measured R_DS(ON) was normalized to eliminate the case
temperature effects and reflect only changes due to
degradation. Due to manufacturing variability, the pristine
condition R_DS(ON) varies from device to device. In order to take
this into account, the normalized R_DS(ON) time series is shifted
by applying a bias factor representing the pristine condition
value. The resulting trajectory (ΔR_DS(ON)), from pristine
condition to failure, represents the degradation process due to
die-attach failure, that is, the increase in R_DS(ON)
through the aging process.

These measurements do not have a fixed sampling rate.
On average, there is a transient response measurement every
400 ns. Each measurement consists of a snapshot of the transient
response that includes one full square waveform cycle. Therefore,
the curve was resampled to obtain uniform
sampling and a reduced sampling frequency on the failure
precursor trajectory. The signals were filtered by computing
the mean over each one-minute window. Six MOSFETs aged
under thermal overstress are available. Figure 1
presents the ΔR_DS(ON) trajectories for the six cases.
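
The feature computation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names and the sample values are hypothetical, and the case-temperature normalization is omitted because the text does not specify its form.

```python
from statistics import mean

def on_resistance(v_ds, i_d):
    """R_DS(ON) as the ratio of ON-state drain-source voltage to drain current."""
    return v_ds / i_d

def window_means(times_s, values, window_s=60.0):
    """Average irregularly sampled values over consecutive fixed-length windows.

    Returns (window_start_time, mean_value) pairs; empty windows are skipped.
    """
    buckets = {}
    for t, v in zip(times_s, values):
        buckets.setdefault(int(t // window_s), []).append(v)
    return [(idx * window_s, mean(vals)) for idx, vals in sorted(buckets.items())]

def delta_from_pristine(series, n_baseline=1):
    """Shift a series so that its pristine (initial) level maps to zero."""
    baseline = mean(v for _, v in series[:n_baseline])
    return [(t, v - baseline) for t, v in series]

# Hypothetical ON-state snapshots: (time in s, V_DS in V, I_D in A).
snapshots = [(0.0, 0.50, 2.5), (20.0, 0.52, 2.5), (70.0, 0.55, 2.5), (110.0, 0.57, 2.5)]
times = [s[0] for s in snapshots]
r_on = [on_resistance(s[1], s[2]) for s in snapshots]

resampled = window_means(times, r_on)      # uniform one-minute grid
delta_r = delta_from_pristine(resampled)   # increase relative to pristine level
```

Averaging within fixed windows both regularizes the sampling grid and acts as a simple low-pass filter, which is the role the one-minute mean plays in the text.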



Figure 1. ΔR_DS(ON) trajectories for all MOSFETs.


3 DEGRADATION MODELING 

An empirical degradation model is suggested based on the
degradation process observed on ΔR_DS(ON) for the six aged
devices. It can be seen that this process grows exponentially as
a function of time and that the exponential behavior starts at
different points in time for different devices. An empirical
degradation model can be used to model the degradation
process when a physics-based degradation model is not
available. This methodology has been used for prognostics of
electrolytic capacitors using a Kalman filter [12]. There, the
exponential degradation model was posed as a linear
first-order discrete dynamic system in the form of a state-space
model representing the dynamics of the degradation process.
The proposed degradation model for the power MOSFET
application is defined as follows. Let R = ΔR_DS(ON) be the
increase in ON-resistance due to aging.


$$R = \alpha \left( e^{\beta t} - 1 \right), \tag{1}$$

where t is time and α and β are model parameters that could
be static or estimated on-line as part of the Bayesian tracking
framework. This model structure is capable of representing the
exponential behavior of the degradation process for the
different devices. Table 1 presents parameter estimation
results for model (1) based on non-linear least-squares
estimation. The estimate for both parameters is presented
along with their corresponding sample variance. It is clearly
observed that the parameters of the model will be different for
different devices. Therefore, the parameters α and β need to
be estimated online in order to ensure accuracy. Figure 2
presents the estimation results for device #36.

#09    2.60×10^-4    3.60×10^-2    1.64×10^-9    -4.71×10^-9
#36    2.67×10^-3    1.31×10^-2    1.02×10^-8     2.99×10^-8


Table 1. Static parameter estimation results for degradation
model in equation (1) applied to degradation data in Figure 1.



Figure 2. Non-linear least squares for device #36. 
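
A static fit of model (1) can be sketched as below. This is only one possible way to carry out the least-squares estimation and is not claimed to be the solver used for Table 1: it exploits the fact that, for a fixed β, the best α has a closed form, so only β needs to be searched. The function name, search bounds, and the synthetic data are assumptions made for illustration.

```python
import math

def fit_exponential(times, values, beta_lo=1e-4, beta_hi=0.1, n_grid=60, iters=60):
    """Least-squares fit of R = alpha * (exp(beta * t) - 1).

    For a fixed beta the optimal alpha is sum(R*g) / sum(g*g) with
    g = exp(beta*t) - 1, so the search is one-dimensional in beta:
    a coarse log-spaced grid brackets the minimum, then golden-section
    search refines it.
    """
    def alpha_for(beta):
        g = [math.expm1(beta * t) for t in times]
        return sum(r * gi for r, gi in zip(values, g)) / sum(gi * gi for gi in g)

    def sse(beta):
        a = alpha_for(beta)
        return sum((r - a * math.expm1(beta * t)) ** 2 for r, t in zip(values, times))

    grid = [beta_lo * (beta_hi / beta_lo) ** (i / n_grid) for i in range(n_grid + 1)]
    best = min(range(n_grid + 1), key=lambda i: sse(grid[i]))
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, n_grid)]

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    beta = 0.5 * (lo + hi)
    return alpha_for(beta), beta

# Synthetic, noise-free trajectory generated from illustrative parameter values.
true_alpha, true_beta = 2.67e-3, 1.31e-2
times = [10.0 * i for i in range(22)]
values = [true_alpha * math.expm1(true_beta * t) for t in times]

alpha_hat, beta_hat = fit_exponential(times, values)
```

On noisy measured trajectories the recovered parameters would differ from device to device, which is the observation that motivates on-line estimation in the next section.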


3.1 Dynamic degradation model for Bayesian tracking 

The degradation model presented in equation (1) is
converted into a dynamic model in order to obtain the
state-space representation needed for Bayesian tracking. Letting
the parameters α and β be time dependent, the time
derivative of (1) is given by

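$$\dot{R} = \alpha e^{\beta t} \left( \beta + t \dot{\beta} \right) + e^{\beta t} \dot{\alpha} - \dot{\alpha}. \tag{2}$$

Defining $\dot{\alpha} = 0$ and $\dot{\beta} = 0$, the dynamic model
representation is given by

$$\dot{R} = (R + \alpha)\,\beta, \qquad \dot{\alpha} = 0, \qquad \dot{\beta} = 0. \tag{3}$$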
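
The step from (2) to (3) relies on rewriting the exponential in terms of the state itself. A short worked derivation, using only model (1) and the constant-parameter assumption stated in the text:

```latex
\begin{aligned}
\dot{R} &= \alpha e^{\beta t}\,(\beta + t\dot{\beta}) + \dot{\alpha}\,(e^{\beta t} - 1) \\
        &= \alpha \beta e^{\beta t}
           && (\dot{\alpha} = 0,\ \dot{\beta} = 0) \\
        &= (R + \alpha)\,\beta
           && (\text{since } R + \alpha = \alpha e^{\beta t} \text{ by (1)})
\end{aligned}
```

Eliminating the explicit time dependence in this way is what allows the model to be written purely in terms of the states R, α and β.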


In this model, α and β are also state variables that change
through time. Therefore, the model is a non-linear dynamic
system and Bayesian tracking algorithms like the extended
Kalman or particle filters are needed for on-line state
estimation. The forward difference method is used to
approximate the time derivatives in order to discretize the
model in equation (3). The first step in the process is

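$$\frac{R(k+1) - R(k)}{\Delta t} = \left[ R(k) + \alpha(k) \right] \beta(k). \tag{4}$$

Solving for R(k + 1) and applying the method to α and β
we get:

$$\begin{aligned}
R(k+1) &= R(k) + \Delta t\, \beta(k) \left[ R(k) + \alpha(k) \right], \\
\alpha(k+1) &= \alpha(k), \\
\beta(k+1) &= \beta(k).
\end{aligned} \tag{5}$$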
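
As a sanity check on the discretization, the forward-difference recursion (5) can be compared against the closed-form model (1). The sketch below uses illustrative parameter values and step sizes chosen here as assumptions; the function names are hypothetical.

```python
import math

def step(state, dt):
    """One forward-difference step of model (5); state = (R, alpha, beta)."""
    r, alpha, beta = state
    return (r + dt * beta * (r + alpha), alpha, beta)

def simulate(alpha, beta, t_end, dt):
    """Propagate R from R(0) = 0 up to t_end with a fixed step dt."""
    state = (0.0, alpha, beta)
    for _ in range(round(t_end / dt)):
        state = step(state, dt)
    return state[0]

alpha, beta = 2.67e-3, 1.31e-2   # illustrative values (assumption)
exact = alpha * (math.exp(beta * 200.0) - 1.0)   # closed-form model (1)
coarse = simulate(alpha, beta, 200.0, dt=1.0)
fine = simulate(alpha, beta, 200.0, dt=0.01)
```

The recursion gives R(k) + α = α(1 + βΔt)^k, so it slightly underestimates the exponential and converges to it as Δt shrinks; the step size is therefore part of the model error that the process noise has to absorb.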

4 PROGNOSTICS ALGORITHM DEVELOPMENT 

A prognostics algorithm in this application predicts the 
remaining useful life of a particular power MOSFET device at 
different points in time through the accelerated life of the 
device. As indicated earlier, ΔR_DS(ON) is used in this study as a
health indicator feature and as a precursor of failure. The
prognostics problem is posed in the following way. 

• A single feature (ΔR_DS(ON)) is used to assess the health
state of the device.

• It is assumed that the die-attach failure mechanism is
the only active degradation mechanism during the accelerated
aging experiment.

• Furthermore, ΔR_DS(ON) accounts for the degradation
progression from nominal condition through failure.

• Periodic measurements with a fixed sampling rate are
available for ΔR_DS(ON).

• A crisp failure threshold of 0.045 in ΔR_DS(ON) is used.

• The prognostics algorithm makes a prediction of the
remaining useful life at time t_p, using all the
measurements up to this point to estimate the health
state at time t_p in a Bayesian tracking framework.

4.1 Extended Kalman filter implementation

The extended Kalman filter allows for the implementation of
the Kalman filter algorithm for on-line estimation on
non-linear dynamic systems [13, 14]. This algorithm has been used
in other applications for health state estimation and
prognostics. The general form of the extended Kalman filter is
given as

$$\begin{aligned}
x(k+1) &= f(x(k), u(k)) + w(k), \\
y(k) &= h(x(k)) + v(k),
\end{aligned}$$

where f and h are non-linear equations, w(k) is the model
noise and v(k) is the measurement noise. Noise is considered
to be normally distributed, with zero mean and known
variance Q and R for w(k) and v(k), respectively.

For the prognostics implementation using the discrete
dynamic degradation model in equation (5), the state variable
is defined as

$$x(k) = \{x_1(k), x_2(k), x_3(k)\}^T = \{R(k), \alpha(k), \beta(k)\}^T.$$

Therefore, f is a vector-valued function given by equation
(8). The ON-resistance is the only measured value; therefore,
the measurement equation h is given by equation (9).


$$f = \begin{bmatrix}
x_1(k) + \Delta t\, x_3(k) \left[ x_1(k) + x_2(k) \right] \\
x_2(k) \\
x_3(k)
\end{bmatrix}, \tag{8}$$

$$h = x_1(k). \tag{9}$$
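
One predict/update cycle of an extended Kalman filter for this model can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the state is ordered as (R, α, β), the measurement is taken to be R itself, and the initial state, covariances, and noise levels are hypothetical values chosen for the demonstration. The measurement-noise variance is named `r_var` to avoid a clash with the state R.

```python
import math

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def ekf_predict(x, p, q, dt):
    """Propagate state and covariance through model (5)."""
    r, alpha, beta = x
    x_pred = [r + dt * beta * (r + alpha), alpha, beta]
    # Jacobian of the state transition with respect to (R, alpha, beta).
    f_jac = [[1.0 + dt * beta, dt * beta, dt * (r + alpha)],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
    fpf = mat_mul(mat_mul(f_jac, p), transpose(f_jac))
    p_pred = [[fpf[i][j] + q[i][j] for j in range(3)] for i in range(3)]
    return x_pred, p_pred

def ekf_update(x, p, y, r_var):
    """Correct the prediction with a scalar measurement y of R (H = [1, 0, 0])."""
    s = p[0][0] + r_var                      # innovation variance
    gain = [p[i][0] / s for i in range(3)]   # Kalman gain
    innovation = y - x[0]
    x_new = [x[i] + gain[i] * innovation for i in range(3)]
    p_new = [[p[i][j] - gain[i] * p[0][j] for j in range(3)] for i in range(3)]
    return x_new, p_new

# Synthetic, noise-free measurements from model (1); all values are assumptions.
true_alpha, true_beta, dt = 2.67e-3, 1.31e-2, 1.0
x = [0.0, 2.0e-3, 1.5e-2]                    # deliberately offset initial guess
p = [[1e-6, 0.0, 0.0], [0.0, 1e-6, 0.0], [0.0, 0.0, 1e-5]]
q = [[1e-8, 0.0, 0.0], [0.0, 1e-10, 0.0], [0.0, 0.0, 1e-10]]
r_var = 1e-6

for k in range(1, 201):
    y = true_alpha * (math.exp(true_beta * k * dt) - 1.0)
    x, p = ekf_predict(x, p, q, dt)
    x, p = ekf_update(x, p, y, r_var)
```

Because only R is measured, α and β are corrected solely through their cross-covariance with R, which is why the choice of Q and of the initial covariance has a strong influence on how quickly the parameters adapt.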


5 RUL ESTIMATION RESULTS 

This section presents the results of the implemented
algorithm. Four test cases are defined following
the leave-one-out validation concept:

• T1: Predict RUL on device #36, estimate initial conditions
with the rest of the devices and compute RUL at times
t_p = [140,150,160,170,180,190,195,200,205,210]

• T2: Predict RUL on device #09, estimate initial conditions
with the rest of the devices and compute RUL at times
t_p = [140,150,160,170,180,190,195,200,205,210]

• T3: Predict RUL on device #08, estimate initial conditions
with the rest of the devices and compute RUL at times
t_p = [80,90,100,110,120,125,130,135,140]

• T4: Predict RUL on device #14, estimate initial conditions
with the rest of the devices and compute RUL at times
t_p = [80,90,100,110,120,125,130,135,140]

RUL estimates are computed by subtracting the time 
when the prediction was made from the time when predicted R 
crosses the failure threshold. As more data becomes available, 
the predictions are expected to become more accurate and 
more precise. Table 2 presents the initial conditions for all the 
test cases. The initial conditions for the parameters and their 
corresponding variances are computed by taking the sample 
mean and sample standard deviation of training device 
parameters in Table 1. The initial value for R and its standard
deviation are computed by using the first ten data points in
the training devices.

Table 2. Initial conditions for the state vector and its
corresponding variance for all the test cases.
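
The leave-one-out initialization described above can be sketched as follows. The device labels and parameter values in this example are placeholders, not the values behind Table 2, and the function name is hypothetical.

```python
from statistics import mean, stdev

def initial_conditions(params_by_device, test_device):
    """Sample mean and sample standard deviation of (alpha, beta)
    over all devices except the one held out for testing."""
    training = [p for dev, p in params_by_device.items() if dev != test_device]
    alphas = [p[0] for p in training]
    betas = [p[1] for p in training]
    return {"alpha": (mean(alphas), stdev(alphas)),
            "beta": (mean(betas), stdev(betas))}

# Placeholder per-device (alpha, beta) estimates; labels and values are made up.
fitted = {"A": (1e-3, 0.01), "B": (2e-3, 0.02), "C": (3e-3, 0.03), "D": (4e-3, 0.04)}

init = initial_conditions(fitted, test_device="A")
```

Holding out the test device keeps its own fitted parameters from leaking into the filter's starting point, so each test case starts from information that would be available before that device is aged.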


Figure 3 and Figure 4 present the RUL estimation results
for test cases T1 and T3, respectively. Analysis of the subplots
from top to bottom shows how the prediction progresses as
more data becomes available and the device gets closer to end
of life. It also describes how prognostics consists of periodic
RUL predictions through the life of the device.



Figure 3. Health state (ΔR_DS(ON)) tracking and RUL
forecasting for test case T1 (horizontal axis: aging time, hr).

Figure 4. Health state (ΔR_DS(ON)) tracking and RUL
forecasting for test case T3.

Table 3 and Table 4 present the state estimation results for
ΔR_DS(ON) and the forecasting of ΔR_DS(ON) after measurements
are no longer available. Measurements are available up to time
t_p; these are used by the algorithm to update the state
estimate. The prediction portion starts after t_p. An estimate
of the expected value of RUL is presented along with the
sample standard deviation. These values are computed by
Monte Carlo simulation using the last available state estimate
and the state transition equation in (8). The estimation error
and relative accuracy (RA) are presented as performance
metrics. RA is defined as


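$$RA = 100 \left( 1 - \frac{\left| RUL^{*} - \widehat{RUL} \right|}{RUL^{*}} \right), \tag{10}$$

where RUL* is the true remaining useful life and $\widehat{RUL}$ is the
predicted value.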
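
As a worked check of the RA metric, the sketch below recomputes two entries from the result tables. It assumes the common definition RA = 100 (1 − |RUL* − predicted RUL| / RUL*) and that the reported error equals the true RUL minus the predicted RUL; both are assumptions used here because they reproduce the tabulated values to within rounding.

```python
def relative_accuracy(rul_true, rul_pred):
    """Relative accuracy in percent; 100 means a perfect prediction."""
    return 100.0 * (1.0 - abs(rul_true - rul_pred) / rul_true)

# Test case T1 at t_p = 140 (Table 3): predicted RUL and reported error.
rul_pred_t1, error_t1 = 49.545, 30.583
ra_t1 = relative_accuracy(rul_pred_t1 + error_t1, rul_pred_t1)

# Test case T3 at t_p = 140 (Table 4): the error exceeds the true RUL,
# so the metric becomes negative.
rul_pred_t3, error_t3 = 5.795, -3.162
ra_t3 = relative_accuracy(rul_pred_t3 + error_t3, rul_pred_t3)
```

Because the error is divided by the true RUL, the same absolute error is penalized more heavily near end of life, which explains why RA can drop even as the absolute error shrinks.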


         T1                                      T2
t_p      RUL (σ_RUL)       Error     RA          RUL (σ_RUL)       Error     RA
140      49.545 (1.180)    30.583    61.840      61.707 (1.598)    26.212    70.181
150      52.260 (1.297)    17.817    74.600      55.237 (1.306)    22.649    70.928
160      50.275 (1.249)     9.862    83.602      46.184 (1.011)    21.744    67.978
170      45.548 (1.129)     4.590    90.847      35.619 (0.682)    22.266    61.546
180      37.342 (1.001)     2.802    93.020      27.040 (0.548)    20.857    56.461
190      29.265 (0.816)     0.889    97.052      21.319 (0.479)    16.578    56.263
195      21.336 (0.646)     3.757    85.058      19.548 (0.466)    13.339    59.462
200      15.673 (0.517)     4.476    77.781      16.983 (0.430)    10.900    60.936
205      10.953 (0.427)     4.198    72.280      14.169 (0.389)     8.730    61.886
210       6.608 (0.314)     3.523    65.272      11.737 (0.359)     6.161    65.587


Table 3. RUL estimation results for test cases T1 and T2. The
mean RUL and the standard deviation are presented in the
first column; error and relative accuracy are presented in the
second and third column of each test case.

         T3                                      T4
t_p      RUL (σ_RUL)       Error     RA          RUL (σ_RUL)       Error     RA
80       56.166 (1.461)     6.379    89.816      56.180 (1.391)     7.162    88.704
90       47.521 (1.018)     5.117    ...         ...               -0.262    99.510
100      46.605 (0.984)    -4.001    90.616      ...    (1.520)   -18.218    58.026
110      54.881 (1.237)   -22.329    31.573      65.075 (1.757)   -31.798     4.807
120      40.287 (0.858)   -17.631    22.100      47.704 (1.045)   -24.304    -3.847
125      30.310 (0.586)   -12.666    28.168      36.258 (0.781)   -17.885     2.818
130      21.614 (0.402)    -8.999    28.766      24.550 (0.493)   -11.160    16.737
135      12.615 (0.238)    -4.996    34.539      13.499 (0.272)    -5.078    39.577
140       5.795 (0.141)    -3.162   -20.125       6.599 (0.168)    -3.197     6.081


Table 4. RUL estimation results for test cases T3 and T4. The
mean RUL and the standard deviation are presented in the
first column; error and relative accuracy are presented in the
second and third column of each test case.


The performance of the algorithm depends on the
selection of the covariance matrix Q for the model noise w(k)
and the variance R for the measurement noise v(k). Their
respective values have been used as tuning parameters for the
algorithm. The covariance values were constant for all the
test cases.
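
The Monte Carlo RUL forecast used for Tables 3 and 4 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes that each sample propagates the last state estimate through the state transition model with additive Gaussian process noise until the 0.045 threshold is crossed. The starting state, noise levels, sample count, and function names are hypothetical.

```python
import random
from statistics import mean, stdev

def rul_sample(state, q_std, threshold, dt, rng, max_steps=100000):
    """Propagate one noisy trajectory until R reaches the failure threshold."""
    r, alpha, beta = state
    steps = 0
    while r < threshold and steps < max_steps:
        r = r + dt * beta * (r + alpha) + rng.gauss(0.0, q_std[0])
        alpha += rng.gauss(0.0, q_std[1])
        beta += rng.gauss(0.0, q_std[2])
        steps += 1
    return steps * dt   # time from the prediction point to the crossing

def monte_carlo_rul(state, q_std, threshold=0.045, dt=1.0, n=500, seed=0):
    """Mean and sample standard deviation of RUL over n simulated trajectories."""
    rng = random.Random(seed)
    samples = [rul_sample(state, q_std, threshold, dt, rng) for _ in range(n)]
    return mean(samples), stdev(samples)

# Hypothetical state estimate (R, alpha, beta) at the prediction time t_p.
state_at_tp = (0.02, 2.67e-3, 1.31e-2)

rul_mean, rul_std = monte_carlo_rul(state_at_tp, q_std=(1e-4, 0.0, 0.0))
det_mean, det_std = monte_carlo_rul(state_at_tp, q_std=(0.0, 0.0, 0.0), n=2)
```

With the process noise set to zero every sample crosses the threshold at the same step, so the spread of the RUL estimate in this sketch comes entirely from the assumed noise levels, in line with the role of Q as a tuning parameter.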